Abstract — Full-rate space-time codes (STC) with rate = 
number of transmit antennas have high multiplexing gain, but 
high decoding complexity even when decoded using reduced- 
complexity decoders such as sphere or QRDM decoders. In 
this paper, we introduce a new code property of STC called 
block-orthogonal property, which can be exploited by QR- 
decomposition-based decoders to achieve significant decoding 
complexity reduction without performance loss. We show that 
such complexity reduction principle can benefit the existing 
algebraic codes such as Perfect and DjABBA codes due to 
their inherent (but previously undiscovered) block-orthogonal 
property. In addition, we construct and optimize new full-rate 
BOSTC (Block-Orthogonal STC) that further maximize the 
QRDM complexity reduction potential. Simulation results of bit 
error rate (BER) performance against decoding complexity show 
that the new BOSTC outperforms all previously known codes as 
long as the QRDM decoder operates in reduced-complexity mode, 
and the code exhibits a desirable complexity saturation property. 

Index Terms — Space-time codes (STC), orthogonal STC, quasi- 
orthogonal STC, block-orthogonal STC, QRD-M algorithm, de- 
coding complexity. 

I. Introduction 

Because of their simple maximum-likelihood (ML) decoding,
space-time codes (STC) with pure orthogonal property have
received considerable attention in the past decade [1]-[4].
However, the code rates of orthogonal STC are mostly low [5].
To increase the code rates, pure orthogonality has been
relaxed to quasi-orthogonality for STC in [6]-[15].

To pursue high transmission rates, high-rate STC such as
Bell Labs layered space-time (BLAST) [16], double space-time
transmit diversity (D-STTD) code [17], DjABBA code [18] and
algebraic STC [19], [20] have been developed, but they demand
a high maximum-likelihood (ML) decoding complexity¹ due to
the non-orthogonal code structure. To reduce the decoding
complexity of existing algebraic STC, a fast-decodable
structure is proposed in [21]; however, the associated
complexity reduction is upper bounded by the maximum code
rate of (quasi-)orthogonal STC [22], and hence is limited.

¹ In this paper, decoding complexity represents the number of
likelihood function calculations per symbol duration in a
decoding process.

Basically, quasi-orthogonality and fast-decodability in STC
imply additional zero entries in the upper-triangular matrix
after QR decomposition of the equivalent channel matrices.
These zero entries are exploited in breadth-first search or
depth-first search decoders such as QRDM and sphere decoders
to achieve decoding complexity reduction. In this paper, we 
introduce a new property for full rate STC, called block- 
orthogonal property, and propose further QRDM complexity 
reduction for codes with such property. The proposed decoding 
principle can benefit many existing algebraic codes due to 
their previously undiscovered block-orthogonal property. For 
example, D-STTD code and DjABBA code have about 50% 
decoding complexity reduction. Moreover, we design new 
full-rate codes called block orthogonal STC (BOSTC) that 
further exploit the block-orthogonal property for complexity 
reduction in QRDM decoders. Besides the usual bit error 
rate (BER) against signal-to-noise ratio (SNR) investigation 
approach, we also adopt a new approach: BER comparison 
against decoding complexity, which gives interesting new 
insights into codes which are optimal with respect to specific 
decoding complexity levels. 

The rest of this paper is organized as follows. The system
model is presented in Section II. The block-orthogonal
property and BOSTC are introduced and studied in Section III.
In Section IV, the benefit of the block-orthogonal property is
described and simulated. New BOSTC for an arbitrary number of
transmit antennas are constructed and optimized in Section V.
The bit error rate (BER) performance simulations are provided
in Section VI. This paper is concluded in Section VII.

In what follows, bold lower case and upper case letters
denote vectors and matrices (sets), respectively; $\mathbb{R}$
and $\mathbb{C}$ denote the real and the complex number field,
respectively; $(\cdot)^R$ and $(\cdot)^I$ stand for the real
and the imaginary part of a complex vector or matrix,
respectively; $[\cdot]^T$, $[\cdot]^H$, $\|\cdot\|$ and
$\mathrm{rank}(\cdot)$ denote the transpose, the complex
conjugate transpose, the Frobenius norm and the rank of a
matrix, respectively; $[a_{ij}]$ denotes a matrix with the
$i$-th row and the $j$-th column element $a_{ij}$.

II. System Model 

A. Signal Model 

We consider a space-time coded multi-input multi-output
(MIMO) system employing $N_t$ transmit antennas and $N_r$
receive antennas. Let the transmitted signal sequences be
partitioned into independent time blocks, denoted as
$\{s_1, s_2, \cdots, s_L\}$, where $s_l$ are real-valued
information symbols² for transmission. To transmit
$\{s_1, s_2, \cdots, s_L\}$ from $N_t$ transmit antennas over
$T$ symbol durations, an STBC matrix
$\mathbf{X} \in \mathbb{C}^{T \times N_t}$ is designed
following the signal model in [23]:

$$\mathbf{X} = \sum_{l=1}^{L} s_l \mathbf{C}_l \tag{1}$$

where $\mathbf{C}_l \in \mathbb{C}^{T \times N_t}$
($l = 1, \cdots, L$) are called dispersion matrices. The code
rate is $\frac{L}{2T}$ considering complex symbol
transmission, and the average energy of the code matrix
$\mathbf{X}$ is constrained to
$\mathcal{E}_X = E\|\mathbf{X}\|^2 = T$.

The received signals $y_{tm}$ of the $m$th
($m = 1, \cdots, N_r$) receive antenna at time $t$
($t = 1, \cdots, T$) can be arranged in a $T \times N_r$
matrix $\mathbf{Y} = [\mathbf{y}_1\ \mathbf{y}_2\ \cdots\ \mathbf{y}_{N_r}]$.
Thus, the transmit-receive signal relation can be represented
as:

$$\mathbf{Y} = \sqrt{\rho}\,\mathbf{X}\mathbf{H} + \mathbf{Z} \tag{2}$$

where $\mathbf{H}_{N_t \times N_r} = [\mathbf{h}_1\ \mathbf{h}_2\ \cdots\ \mathbf{h}_{N_r}]$
is the channel coefficient matrix. We assume that the
communication channel is quasi-static Rayleigh fading with
independently, identically distributed (i.i.d.)
$\mathcal{CN}(0, 1)$ entries;
$\mathbf{Z}_{T \times N_r} = [\mathbf{z}_1\ \mathbf{z}_2\ \cdots\ \mathbf{z}_{N_r}] = [z_{tm}]$
is the additive white Gaussian noise (AWGN) matrix whose
entries $z_{tm}$ are i.i.d. $\mathcal{CN}(0, 1)$; $\rho$ is
the average SNR at each receive antenna.

Following the signal model in [23], the received signal can
also be shown to be:

$$\mathbf{y} = \sqrt{\rho}\,\mathcal{H}\mathbf{s} + \mathbf{z} \tag{3}$$

where $\mathbf{y} \in \mathbb{R}^{2TN_r \times 1}$,
$\mathbf{s} \in \mathbb{R}^{L \times 1}$,
$\mathbf{z} \in \mathbb{R}^{2TN_r \times 1}$ and
$\mathcal{H} \in \mathbb{R}^{2TN_r \times L}$ are the
equivalent received signal vector, information symbol vector,
equivalent noise vector and equivalent channel matrix,
respectively.

To avoid rank deficiency at the decoder,
$\mathrm{rank}(\mathcal{H}) = L$ is required, which means that
$\mathcal{H}$ should be "tall", i.e., $L \le 2TN_r$ [23], [24].
Therefore, we assume that the number of receive antennas
satisfies $N_r \ge N_t$. Moreover, the dispersion matrices
$\mathbf{C}_1, \mathbf{C}_2, \cdots, \mathbf{C}_L$ must be
linearly independent to guarantee
$\mathrm{rank}(\mathcal{H}) = L$ [15].

2 The in-phase component or the quadrature component of a complex 
information symbol is real, hence, this signal model is also applicable for 
complex information symbol transmission. 



B. Code Rate of STC 

Lemma 1. [23], [24] In an $N_t \times N_r$ MIMO system, the
code rate of the STC applied cannot exceed the minimum of the
transmit and receive antenna numbers, i.e.,

$$\mathrm{Rate} = \frac{L}{2T} \le \min(N_t, N_r). \tag{4}$$



Definition 1 (Full-Rate STC). An STC for $N_t \times N_r$ MIMO
systems is full-rate when its code rate achieves the value of
$\min(N_t, N_r)$. ■

In this paper, we always assume that $N_t \le N_r$; hence an
STC is full-rate when its code rate achieves the value of $N_t$.

III. Block-Orthogonal STC 

In this section, block-orthogonal STC (BOSTC) and the
block-orthogonal code property [25] are defined and discussed.

A. Definition of BOSTC 

Most reduced-complexity MIMO decoders such as the sphere
decoder [26] and the QR decoder with M-algorithm (QRDM)
[27], [28] are based on QR decomposition. With QR
decomposition, BOSTC is defined as follows:

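Definition 2 (BOSTC). Suppose that
$\mathcal{H}_{2TN_r \times L}$ is the equivalent channel
matrix when an STC $\mathbf{X}_{T \times N_t}$ is applied in
$N_t \times N_r$ MIMO systems. Denoting the QR decomposition
of $\mathcal{H}$ as $\mathcal{H} = \mathbf{Q}\mathbf{R}$, where
$\mathbf{Q} = [\mathbf{q}_1 \cdots \mathbf{q}_L] \in \mathbb{R}^{2TN_r \times L}$
is unitary and $\mathbf{R} \in \mathbb{R}^{L \times L}$ is
upper-triangular, $\mathbf{X}$ is called a block-orthogonal
STC (BOSTC) and has block-orthogonal structure if

$$\mathbf{R} = \begin{bmatrix}
\mathbf{D}_1 & \mathbf{E}_{12} & \cdots & \mathbf{E}_{1\Gamma} \\
\mathbf{0} & \mathbf{D}_2 & \cdots & \mathbf{E}_{2\Gamma} \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{0} & \mathbf{0} & \cdots & \mathbf{D}_\Gamma
\end{bmatrix} \tag{5}$$

where the sub-block $\mathbf{D}_i$ is a full-rank diagonal
matrix of size $k_i \times k_i$ as shown in (6), and the
information symbols corresponding to the same sub-block are
independent (i.e., their values represent independent
information) and orthogonal (i.e., their dispersion matrices
satisfy the quasi-orthogonal constraints (QOC) in [10]);
$\Gamma$ is the number of sub-blocks $\mathbf{D}$ and
$\sum_{i=1}^{\Gamma} k_i = L$; $\mathbf{E}_{i_1 i_2}$
($i_1 = 1, 2, \cdots, \Gamma - 1$,
$i_2 = i_1 + 1, \cdots, \Gamma$) denotes a matrix containing
arbitrary values.

$$\mathbf{D}_i = \mathrm{diag}(u_{i,1}, u_{i,2}, \cdots, u_{i,k_i}) \tag{6}$$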



In Def. 2, $\mathbf{D}_i$ ($i = 1, 2, \cdots, \Gamma$) in (6)
are diagonal matrices with non-zero scalar diagonal entries.
If these scalar diagonal entries are replaced with square
upper-triangular matrices such as:

$$\mathbf{D}_i = \mathrm{diag}(\mathbf{U}_{i,1}, \mathbf{U}_{i,2}, \cdots, \mathbf{U}_{i,k_i}) \tag{7}$$

where $\mathbf{U}_{i,k}$ are full-rank upper-triangular
matrices of size $\gamma_{i,k} \times \gamma_{i,k}$ with
$\sum_{i=1}^{\Gamma} \sum_{k=1}^{k_i} \gamma_{i,k} = L$,
$i = 1, 2, \cdots, \Gamma$, $k = 1, 2, \cdots, k_i$, then the
code can be viewed as a block-quasi-orthogonal code (instead
of block-orthogonal). The information symbols corresponding to
the same sub-block $\mathbf{D}$ and different $\mathbf{U}$'s
are independent and orthogonal.

In general, the size of the (block-)diagonal matrices
$\mathbf{D}$ and the upper-triangular matrices $\mathbf{U}$
can be arbitrary. In this paper only the case that all
$\mathbf{D}$'s have the same size $k \times k$ (i.e.,
$k_1 = k_2 = \cdots = k_\Gamma = k$) and all $\mathbf{U}$'s
have the same size $\gamma \times \gamma$ (i.e.,
$\gamma_{1,1} = \cdots = \gamma_{1,k} = \cdots = \gamma_{\Gamma,1} = \cdots = \gamma_{\Gamma,k} = \gamma$)
is considered. Hence, the block-(quasi-)orthogonal structure
can be unified by three parameters as $(\Gamma, k, \gamma)$:

• $\Gamma$: the number of matrices $\mathbf{D}$ (i.e., sub-blocks) in $\mathbf{R}$;

• $k$: the number of scalars $u$ or matrices $\mathbf{U}$ in each $\mathbf{D}$;

• $\gamma$: the number of diagonal entries in each matrix $\mathbf{U}$ ($\gamma = 1$
for scalars $u$).
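The three parameters can be turned into a simple structural test on a given upper-triangular matrix. The following is a minimal sketch under stated assumptions: the function name and the tolerance are hypothetical, the matrix is given as a list of rows, and only the zero pattern of (5)-(7) is checked (the independence and QOC requirements on the symbols are not).

```python
def has_block_orthogonal_structure(r, num_blocks, k, gamma=1, tol=1e-9):
    """Check the zero pattern of R against structure (num_blocks, k, gamma).

    Inside each diagonal sub-block D, entries linking two different
    U matrices (or two different scalars when gamma == 1) must vanish.
    """
    size = num_blocks * k * gamma
    if len(r) != size or any(len(row) != size for row in r):
        return False
    for a in range(size):
        for b in range(size):
            if abs(r[a][b]) <= tol:
                if a == b:
                    return False  # zero on the diagonal: not full rank
                continue
            if a > b:
                return False  # nonzero below the diagonal
            same_block = a // (k * gamma) == b // (k * gamma)
            same_u = a // gamma == b // gamma
            if same_block and not same_u:
                return False  # coupling between different U's in one D
    return True


# Structure (2, 2, 1): two 2 x 2 diagonal sub-blocks, arbitrary E_12.
r_example = [
    [2, 0, 5, 1],
    [0, 3, 7, 4],
    [0, 0, 1, 0],
    [0, 0, 0, 6],
]
print(has_block_orthogonal_structure(r_example, 2, 2))
```

The same routine covers the block-quasi-orthogonal case by passing `gamma > 1`, in which case entries inside each $\gamma \times \gamma$ upper-triangular $\mathbf{U}$ are allowed to be non-zero.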

To simplify the notation further, in the sequel of this paper
we will not make a distinction between block-orthogonal STC
and block-quasi-orthogonal STC. They will both be called
"block-orthogonal STC" with parameters $(\Gamma, k, \gamma)$.

B. Block-Orthogonal Property 

In this section, we present sufficient conditions for an STC 
to attain block-orthogonal structure. 

1) 2-Block BOSTC: We first propose a sufficient condition
for an STC to achieve block-orthogonal structure
$(\Gamma = 2, k, \gamma = 1)$. The case of $\Gamma > 2$ will
be discussed subsequently.

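Theorem 1. Consider an STC of size $T \times N_t$ with
dispersion matrices
$\mathbf{A}_1, \cdots, \mathbf{A}_k, \mathbf{B}_1, \cdots, \mathbf{B}_k$³.
Let $[a_{i,up}]_{2T \times 2N_t}$ and
$[b_{i,up}]_{2T \times 2N_t}$ ($i = 1, \cdots, k$,
$u = 1, \cdots, 2T$, $p = 1, \cdots, 2N_t$) denote the
real-valued $2T \times 2N_t$ matrices formed from the real and
imaginary parts of $\mathbf{A}_i$ and $\mathbf{B}_i$,
respectively; then this STC has block-orthogonal structure
$(2, k, 1)$ if

1. $\{\mathbf{A}_1, \cdots, \mathbf{A}_k, \mathbf{B}_1, \cdots, \mathbf{B}_k\}$ is of dimension $2k$; (8a)

2. $\mathbf{A}_i^T \mathbf{A}_i = \mathbf{I}$, $\mathbf{B}_i^T \mathbf{B}_i = \mathbf{I}$ ($i = 1, \cdots, k$); (8b)

3. $\mathbf{A}_i^T \mathbf{A}_j = -\mathbf{A}_j^T \mathbf{A}_i$ ($i, j = 1, \cdots, k$ and $i \ne j$); (8c)

4. $\mathbf{B}_i^T \mathbf{B}_j = -\mathbf{B}_j^T \mathbf{B}_i$ ($i, j = 1, \cdots, k$ and $i \ne j$); (8d)

5. $\sum_{(p,q,s,t) \in \mathbb{S}} d_{pqst} = 0$ ($i, j = 1, \cdots, k$ and $i \ne j$) (8e)

where
$d_{pqst} = \sum_{\kappa=1}^{k} \left( \sum_{u=1}^{2T} b_{i,up}\, a_{\kappa,us} \cdot \sum_{v=1}^{2T} b_{j,vq}\, a_{\kappa,vt} \right)$
and each element (tuple) of the set $\mathbb{S}$ includes 4
uniquely-permuted scalars⁴ drawn from $\{1, \cdots, 2N_t\}$. ■

The proof of Theorem 1 is given in Appendix A. Based on
Theorem 1, the $2 \times 2$ fast-decodable codes in [29]-[31]
can be shown to have block-orthogonal structure (2, 4, 1).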

³ For ease of presentation, here we employ $\{\mathbf{A}\}$
and $\{\mathbf{B}\}$ as dispersion matrices, instead of
$\{\mathbf{C}\}$ presented in (1).

⁴ For example,
$\sum_{(1,2,1,1) \in \mathbb{S}} d_{pqst} = d_{1112} + d_{1121} + d_{1211} + d_{2111}$
and
$\sum_{(1,2,3,1) \in \mathbb{S}} d_{pqst} = d_{1123} + d_{1132} + d_{1213} + d_{1312} + d_{1231} + d_{1321} + d_{2113} + d_{2131} + d_{2311} + d_{3112} + d_{3121} + d_{3211}$.



2) $\Gamma$-Block BOSTC ($\Gamma > 2$):

Definition 3. Consider an STC with dispersion matrices
$\{\mathbf{A}_1, \cdots, \mathbf{A}_K\}$ and
$\{\mathbf{B}_1, \cdots, \mathbf{B}_k\}$ and an associated
equivalent channel matrix $\mathcal{H}$. The matrices
$\{\mathbf{B}_1, \cdots, \mathbf{B}_k\}$ are said to satisfy
block QOC (i.e., conditions (8b) and (8c)) under matrices
$\{\mathbf{A}_1, \cdots, \mathbf{A}_K\}$ if their associated
$\mathbf{R}(K+1 : K+k,\ K+1 : K+k)$ is diagonal, where
$\mathbf{R}$ is the upper-triangular matrix after QR
decomposition of $\mathcal{H}$, and
$\mathbf{R}(K+1 : K+k,\ K+1 : K+k)$ is the sub-matrix
constituted by the $(K+1)$th to $(K+k)$th rows and the
$(K+1)$th to $(K+k)$th columns of $\mathbf{R}$. ■

Based on Def. 3, a sufficient condition to check whether
an STC has block-orthogonal structure $(\Gamma, k, 1)$ is
provided as follows.

Theorem 2. Denoting the equivalent channel matrix of an STC
with dispersion matrices
$\{\mathbf{A}_1, \cdots, \mathbf{A}_K\}$ and
$\{\mathbf{B}_1, \cdots, \mathbf{B}_k\}$ as
$\mathcal{H} = [\mathcal{H}_1\ \mathcal{H}_2]$,
$\mathcal{H}_1 = [\mathbf{h}_1 \cdots \mathbf{h}_K]$,
$\mathcal{H}_2 = [\mathbf{h}_{K+1} \cdots \mathbf{h}_{K+k}]$,
the matrices $\{\mathbf{B}_1, \cdots, \mathbf{B}_k\}$ satisfy
block QOC under matrices
$\{\mathbf{A}_1, \cdots, \mathbf{A}_K\}$ if

1) $\{\mathbf{B}_1, \cdots, \mathbf{B}_k\}$ satisfy the QOC;

2) the projection coefficient matrix of the vectors
$\mathbf{h}_{K+1}, \cdots, \mathbf{h}_{K+k}$ onto the vector
space $\{\mathbf{h}_1, \cdots, \mathbf{h}_K\}$, i.e., the
sub-matrix formed by the first to the $K$-th rows and the
$(K+1)$-th to the $(K+k)$-th columns of $\mathbf{R}$ in (5),
is para-unitary⁵. ■

The proof of Theorem 2 is given in Appendix B. Using
Theorem 2, we can classify the block-orthogonal structure
achieved by many existing codes, as shown in Table I. We
can see that all these codes have block-orthogonal structure
with $k \le 4$.



C. Relationship with the Existing Works 

The $\mathbf{R}$'s after the QR decomposition of the
equivalent channel matrices $\mathcal{H}$ of group-decodable,
fast-decodable and block-orthogonal STC are compared
graphically in Fig. 1. The group-decodable codes cannot
achieve full code rates [13]. On the other hand, the
fast-decodable codes [21] have very few zeros in the
$\mathbf{R}$ matrix, and hence limited decoding complexity
reduction. This is the reason why we introduce the full-rate
block-orthogonal structure in this paper.

IV. Benefit of Block-Orthogonal Structure: 
Decoding Complexity Reduction 

In this section, we first review the breadth-first search
decoding used in traditional QRDM [27], [28], and then propose
a simplified QRDM that exploits the block-orthogonal structure.
Next, we use the D-STTD code with block-orthogonal structure
(2, 4, 1) to illustrate the decoding complexity reduction.

Assume that the real information symbols corresponding to
an upper-triangular matrix $\mathbf{U}$ are drawn from a
constellation of size $M$. For example, if each real
information symbol in an STC with block-orthogonal structure
$(\Gamma, k, \gamma)$ is 4-PAM (pulse amplitude modulation)
modulated, the symbols corresponding to the same $\mathbf{U}$
are drawn from a constellation of size $M = 4^\gamma$.

⁵ $\mathbf{A}$ is para-unitary if $\mathbf{A}^H \mathbf{A} = \mathbf{I}$.
TABLE I

Block-Orthogonal Structure of Existing Codes for N_t Transmit Antennas over T Symbol Durations¹

Code                 N_t        T          Γ           k    γ
BLAST [16]           N_t        1          N_t         2    1
Golden code [19]     2          2          4           2    1
D-STTD code [17]     4          2          2           4    1
DjABBA code [18]     4          4          4           2    2
Perfect code [20]    3 (or 6)   3 (or 6)   9 (or 36)   1    2
Perfect code [20]    4          4          16          2    1

¹ Without special requirements (e.g., to achieve full diversity, constellation rotation is required for the DjABBA
code [18] and the HEX constellation is applied for the Perfect code with 3 (or 6) transmit antennas [20]), we
assume that each complex information symbol is drawn from a square QAM without constellation rotation;
equivalently, each real information symbol is drawn from a one-dimensional constellation.



[Figure panels: 4-group-decodable structure in [12]; fast-decodable structure in [22]; block-orthogonal structure (4, 4, 1); block-orthogonal structure (4, 4, γ) with γ > 1.]

Fig. 1. R's of group-decodable structure, fast-decodable structure and block-orthogonal structure.



A. Traditional QRDM 

In traditional QRDM, the surviving paths with smaller
accumulated Euclidean distance (Euclidean metric) are picked
from the full (ML decoding) or partial (near-ML decoding)
search tree. Assume that at each stage $M_c$ paths are
reserved, as shown in Fig. 2 where $M_c = 3$. At the
beginning, all the search paths are reserved until the total
number of search paths exceeds $M_c$; then only the $M_c$
paths with the smallest accumulated Euclidean metrics, the
surviving paths, are picked for the Euclidean metric
calculations in the next stage. In this case, $M M_c$ metrics
need to be calculated in each stage. Hence, the decoding
complexity (number of likelihood function calculations) under
traditional QRDM is:

$$O_{\mathrm{Traditional}} = \frac{1}{T} \sum_{l=L}^{1} \left[M_c < M^{L-l+1}\right] M \cdot \left( \left[M_c \ge M^{L-l}\right] M^{L-l} + \left[M_c < M^{L-l}\right] M_c \right), \tag{9}$$

where $[\mathrm{Condition}]$ is 1 when Condition is true, or
0 when Condition is false. In (9), $\frac{1}{T}$ means that
the decoding complexity is averaged over the symbol durations;
$l = L, \cdots, 1$ means that the decoding process is
conducted from $s_L$ to $s_1$; $[M_c < M^{L-l+1}]$ being 1
means that the total number of search paths exceeds $M_c$ and
Euclidean metrics need to be calculated.
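The count in (9) can be evaluated directly. The sketch below is an illustration only; the function and parameter names are hypothetical, and the two bracketed terms of (9) are merged into a single `min`, which is equivalent because the number of paths entering stage $l$ is $M^{L-l}$ until it exceeds $M_c$.

```python
from fractions import Fraction


def traditional_qrdm_complexity(num_symbols, constellation_size,
                                num_survivors, num_durations):
    """Evaluate the traditional QRDM metric count per symbol duration.

    num_symbols is L, constellation_size is M, num_survivors is M_c
    and num_durations is T.
    """
    total = 0
    for l in range(num_symbols, 0, -1):  # decode from s_L down to s_1
        depth = num_symbols - l
        # Metrics are only counted once the tree has more than M_c paths.
        if num_survivors < constellation_size ** (depth + 1):
            parents = min(num_survivors, constellation_size ** depth)
            total += constellation_size * parents
    return Fraction(total, num_durations)


# L = 8 real symbols, 4-PAM, T = 2.
print(traditional_qrdm_complexity(8, 4, 1, 2))       # M_c = 1
print(traditional_qrdm_complexity(8, 4, 4 ** 4, 2))  # M_c = M^(L-k), k = 4
```

With $M_c = 1$ the result equals $\frac{1}{T} L M$, and with $M_c = M^{L-k}$ it equals $\frac{1}{T} k M^{L-k+1}$, which matches the two special cases discussed in the text.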

Note that 1) if $M_c = 1$, traditional QRDM reduces to
successive interference cancellation (SIC) and (9) becomes
$O_{\mathrm{Traditional}} = \frac{1}{T} L M$; 2) if
$M_c = M^{L-k}$, traditional QRDM has the same complexity as
fast decoding [21] and the decoding complexity is reduced from
$\frac{1}{T} M^L$ to
$O_{\mathrm{Traditional}} = \frac{1}{T} k M M_c = \frac{1}{T} k M^{L-k+1}$.
In this case, only the orthogonality in the upper-left
sub-block of the block-orthogonal structure is exploited. In
the following, we propose a simplified decoding that exploits
the orthogonality in all the sub-blocks of the
block-orthogonal structure to reduce $M_c$ to $M_c^{eq}$ for
Euclidean metric calculations, and hence reduces the decoding
complexity (9) further, but without performance loss.

B. Simplified QRDM for Block-Orthogonal Structure 

As denoted in the dashed box of Fig. 2, we assume that
2 real information symbols $\{s_p, s_{p-1}\}$ drawn from a
signal constellation with $M = 4$ are in a sub-block of an STC
with block-orthogonal structure $(\Gamma, k, 1)$ ($k \ge 2$)
and this sub-block is not a first-decoded block. Since $s_p$
and $s_{p-1}$ are independent, the Euclidean metric
calculations for $s_p$ and $s_{p-1}$ can be separated. Under
QRDM with $M_c = 3$, the Euclidean metric calculation number
for $s_p$ is $O_p = M_c M = 12$. Without loss of generality,
the surviving candidates for $s_p$ may be $s_{p,1}$, $s_{p,2}$
and $s_{p,3}$ (as shown in Fig. 2) based on the updated
accumulated Euclidean distance.

Fig. 2. Simplified QRDM trellis diagram ($M = 4$).

Next we calculate the Euclidean metrics for $s_{p-1}$. Since
$s_p$ and $s_{p-1}$ are orthogonal to each other, $s_p$ will
not affect the Euclidean metric calculations for $s_{p-1}$.
Hence the two Euclidean metrics for $s_{p-1}$ along path

$(\cdots \to s_{p+1,1} \to s_{p,1} \to s_{p-1,j})$: blue line in Fig. 2

and path

$(\cdots \to s_{p+1,1} \to s_{p,2} \to s_{p-1,j})$: green line in Fig. 2

are the same, and are equal to the Euclidean metric along the
virtual path (for Euclidean metric calculation only)

$(\cdots \to s_{p+1,1} \to s_{p-1,j})$: red dashed line in Fig. 2

with $j = 1, 2, 3$ and 4. Then the number of Euclidean metric
calculations for $s_{p-1}$ will be reduced from $M_c M = 12$
to $O_{p-1} = M_c^{eq} M = 8$, where $M_c^{eq} = 2$ is the
equivalent surviving path number of
$\{\cdots s_{p+2}, s_{p+1}, s_p\}$ for $s_{p-1}$, or the
surviving path number of $\{\cdots s_{p+2}, s_{p+1}\}$ for
$s_{p-1}$. Note that the red virtual path does not exist under
traditional QRDM without considering the block-orthogonal
structure. Thus, under the proposed simplified QRDM
considering the block-orthogonal structure, the surviving
paths for Euclidean metric calculations are the reserved paths
of the symbols $\{\cdots s_{p+2}, s_{p+1}\}$ decoded in
previous blocks, not the $M_c$ reserved paths of all the
symbols $\{\cdots s_{p+2}, s_{p+1}, s_p\}$ decoded previously.
Note that with the use of the virtual path, we reduce the
Euclidean metric calculation number, but not the surviving
path number. Hence, the decoding complexity reduction will
not cause any loss in BER performance. Without loss of
generality, after updating the accumulated Euclidean distance,
the $M_c = 3$ surviving paths in this stage may be

(1) $\cdots \to s_{p+1,1} \to s_{p,1} \to s_{p-1,2}$: blue line in Fig. 2

(2) $\cdots \to s_{p+1,1} \to s_{p,2} \to s_{p-1,2}$: green line in Fig. 2

(3) $\cdots \to s_{p+1,2} \to s_{p,3} \to s_{p-1,4}$: pink line in Fig. 2
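The counting argument above can be sketched in a few lines. This is an illustration under stated assumptions rather than a decoder: paths are represented as tuples of candidate indices in decoding order, the function names are hypothetical, and only the number of Euclidean metric calculations is modelled, not the metrics themselves.

```python
def equivalent_survivors(paths, decoded_in_block):
    """Count surviving paths after dropping symbols of the current sub-block.

    paths holds one tuple per surviving path, earliest-decoded symbol
    first; decoded_in_block is how many trailing entries belong to the
    sub-block that is currently being decoded.
    """
    if decoded_in_block == 0:
        return len(set(paths))
    return len({path[:-decoded_in_block] for path in paths})


def metric_counts(paths, decoded_in_block, constellation_size):
    """Return (traditional, simplified) metric counts for the next symbol."""
    traditional = len(set(paths)) * constellation_size
    simplified = equivalent_survivors(paths, decoded_in_block) * constellation_size
    return traditional, simplified


# Survivors (s_{p+1}, s_p) as in the example: candidates (1,1), (1,2), (2,3).
survivors = [(1, 1), (1, 2), (2, 3)]
print(metric_counts(survivors, 1, 4))
```

For the three survivors of the example, dropping $s_p$ leaves two distinct prefixes, so the count for $s_{p-1}$ falls from 12 to 8 while all three surviving paths are still kept.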



Suppose that $\{s_p, s_{p-1}, \cdots, s_{p-k+1}\}$ are in the
same sub-block of block-orthogonal structure $(\Gamma, k, 1)$
and their equivalent surviving path numbers are denoted as
$M_{c,p}^{eq}, M_{c,p-1}^{eq}, \cdots, M_{c,p-k+1}^{eq}$,
respectively. Then, instead of (9), we can rewrite the
decoding complexity under the proposed simplified QRDM as:

$$O_{\mathrm{Simplified}} = \frac{1}{T} \sum_{l=L}^{1} \left[M_c < M^{L-l+1}\right] M \cdot \left( \left[M_c \ge M^{L-l}\right] M^{L-l} + \left[M_c < M^{L-l}\right] M_{c,l}^{eq} \right). \tag{10}$$

It is easy to see that
$M_c \ge M_{c,p}^{eq} \ge M_{c,p-1}^{eq} \ge \cdots \ge M_{c,p-k+1}^{eq}$.
Hence, the decoding complexity of an STC with block-orthogonal
structure of $k \ge 2$ under the proposed simplified QRDM is
lower than that under traditional QRDM.



Fig. 3. $M_{c,p-i}^{eq}$ ($i = 0, \cdots, k-1$, $p = 4$, $k = 4$) for the D-STTD code
where each real information symbol is drawn from 8-PAM.



1) Decoding Complexity Reduction Bound: Considering a
sub-block $\{s_p, s_{p-1}, \cdots, s_{p-k+1}\}$ of
block-orthogonal structure $(\Gamma, k, 1)$, it is easy to see
that $M_{c,p-i}^{eq}$ achieves the minimum value of
$\frac{M_c}{M^i}$ ($i = 0, \cdots, k-1$) when each node in the
reserved decoding search tree has $M$ children and hence as
few parent nodes as possible are reserved. The minimum
simplified decoding complexity of this sub-block can be shown
to be (assuming that $M_c < M^{L-p}$)

$$O^{\mathrm{part}}_{\mathrm{Simplified,min}} = \sum_{i=p}^{p-k+1} M M_{c,i}^{eq} = \sum_{i=0}^{k-1} \frac{1}{M^i} M M_c = \frac{M^k - 1}{M^k - M^{k-1}} M M_c \approx \frac{M}{M-1} M M_c.$$

Compared with the traditional per-sub-block decoding
complexity $k M M_c$ (when $M_c < M^{L-p}$), the simplified
decoding complexity of this sub-block can be reduced to a
fraction of approximately $\frac{M}{k(M-1)}$. Note that this
result applies to all non-first-decoded sub-blocks. For the
block-orthogonal structure $(\Gamma, k, \gamma)$ with
$\gamma > 1$, the $\gamma$ information symbols corresponding
to an upper-triangular matrix $\mathbf{U}$ should be viewed as
a unit drawn from a constellation of size $M^\gamma$, instead
of $M$. Hence we can make the following remark:

Remark 1. Compared with the traditional decoding, the
maximum decoding complexity reduction of block-orthogonal
structure $(\Gamma, k, \gamma)$ with simplified decoding
corresponds to a complexity ratio of approximately
$\frac{M^\gamma}{k(M^\gamma - 1)}$, which is a decreasing
function of $k$ and $M^\gamma$.
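The bound can be checked numerically. The sketch below evaluates the minimum per-sub-block count $\sum_{i=0}^{k-1} M M_c / M^i$ and its ratio to the traditional count $k M M_c$; the function names are hypothetical, and $\gamma > 1$ is handled by passing $M^\gamma$ as the constellation size, as the text suggests.

```python
from fractions import Fraction


def min_simplified_block_complexity(constellation_size, k, num_survivors):
    """Sum of M * M_c / M^i for i = 0..k-1 (minimum simplified count)."""
    return sum(
        Fraction(constellation_size * num_survivors, constellation_size ** i)
        for i in range(k)
    )


def min_complexity_ratio(constellation_size, k):
    """Ratio of the minimum simplified count to the traditional k*M*M_c."""
    m = constellation_size
    return Fraction(m ** k - 1, k * m ** (k - 1) * (m - 1))


# 4-PAM, k = 4, M_c = 256.
simplified = min_simplified_block_complexity(4, 4, 256)
traditional = 4 * 4 * 256
print(simplified, traditional, float(simplified / traditional))
print(float(min_complexity_ratio(4, 4)), 4 / (4 * (4 - 1)))
```

For $M = 4$ and $k = 4$ the exact ratio is 85/256, close to the approximation $\frac{M}{k(M-1)} = 1/3$.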



2) Decoding Complexity Reduction Example: From (10), we can
see that the decoding complexity is mainly determined by the
equivalent surviving path number $M_c^{eq}$. Since the actual
$M_c^{eq}$ value can only be estimated experimentally,
simulations are conducted in a $4 \times 2$ MIMO system where
the D-STTD code with block-orthogonal structure (2, 4, 1) is
applied, and $M_{c,l}^{eq}$ of $s_l$ ($l = 4, \cdots, 1$) in
the non-first-decoded sub-block are enumerated. In the
simulations, the communication channel is assumed to be
quasi-static Rayleigh fading and the channel state information
is perfectly known at the receiver.

Experiment I: Assume that each real information symbol in the
D-STTD code is drawn from 8-PAM. The equivalent surviving path
numbers under different SNR values (4 dB, 14 dB and 24 dB) are
shown in Fig. 3. We can see that all the equivalent surviving
path numbers $M_{c,p-i}^{eq}$ ($i = 1, \cdots, k-1$, $p = 4$,
$k = 4$) are much smaller than $M_c$.

Fig. 5. $M_{c,p-i}^{eq}/M_c$ ($i = 1, \cdots, k-1$, $p = 4$, $k = 4$) for the D-STTD
code where each real information symbol is drawn from 8-PAM.
(Horizontal axis: number of surviving paths.)

Experiment II: Assume that each real information symbol in the
D-STTD code is drawn from 4-PAM and 8-PAM in 2 simulations,
respectively. The complexity reduction results of
$M_c^{eq}/M_c$ are shown in Fig. 4 and Fig. 5, where we can
see that $M_{c,p-i}^{eq}/M_c$ is far smaller than 1, and it
decreases with increasing $i$ ($i = 0, \cdots, k-1$, $k = 4$)
and $M$ (i.e., $M^\gamma$).

Remark 2. In the proposed simplified QRDM, the decoding
complexity (10) decreases with increasing $k$ and $M^\gamma$,
and the maximum complexity reduction order concurs with
Remark 1.

C. Decoding Complexity Comparisons 

Under traditional and proposed simplified QRDM, the decoding
complexities of the D-STTD code [17], the DjABBA code [18] and
the Perfect code [20] with block-orthogonal structures
(2, 4, 1), (4, 2, 2) and (16, 2, 1), respectively, in a
$4 \times 4$ MIMO system are compared in Fig. 6, where each
real information symbol is drawn from 16-PAM, 4-PAM and 4-PAM,
respectively. We emphasize that all codes will have exactly
the same BER performance under both QRDM schemes because the
proposed simplified QRDM only reduces the number of Euclidean
metric calculations but not the surviving path number (as
explained earlier in Section IV-B). From Fig. 6 we can see
that the decoding complexity under the proposed simplified
QRDM can be reduced drastically. In particular, the complexity
reduction for the D-STTD code is nearly 50%.

Moreover, the simulation also shows that: 1) the D-STTD code
achieves more decoding complexity reduction than the DjABBA
code. That is because, compared to the DjABBA code with
$k = 2$, the D-STTD code has a larger $k = 4$ (both have the
same $M^\gamma = 16$); 2) the DjABBA code achieves more
decoding complexity reduction than the Perfect code because
the DjABBA code has a larger $M^\gamma = 4^2 = 16$ than the
Perfect code, which has $M^\gamma = 4^1 = 4$ (both have the
same $k = 2$). Both observations concur with Remark 1.

Note that although the proposed simplified QRDM achieves a
lower decoding complexity, its BER performance remains the
same as that of the traditional QRDM because the surviving
path number of both schemes remains the same (recall the
explanation in Section IV-B). Hence, BER comparisons under
traditional and proposed simplified QRDM are unnecessary and
omitted.

V. New BOSTC Construction 

Although we have shown that many existing high-rate STCs 
have some block-orthogonal structure, there are new open 
problems: 

1) The conventional approaches to high-rate code design
tend to focus on the error rate performance criteria and
ignore decoding complexity; hence they may not achieve the
best performance-complexity tradeoff. We can see that for
most existing codes in Table I, $k$ is 2; hence the decoding
complexity reduction under the proposed simplified QRDM is
limited. Furthermore, the Perfect code with 3 and 6 transmit
antennas cannot benefit from simplified QRDM decoding due
to $k = 1$;

2) Many existing BOSTC have low scalability. For example,
the maximum code rate of the DjABBA code is 2.

Therefore in this section, we will construct new BOSTC's
which better exploit the block-orthogonal property and are
more scalable.

A. Construction 

To reduce the decoding complexity, the BOSTC should be
designed for large $k$. Two such systematic construction rules
are presented here.

1) Construction I with Rate-1 Seed Code: Select a rate-1
space-time code $\mathbf{X}_{o, T \times N_t}$ with high $k$
to be the seed code, and a full-rank matrix
$\mathbf{M}_{N_t \times N_t} = [\mathbf{m}_1\ \mathbf{m}_2\ \cdots\ \mathbf{m}_{N_t}]$
to be the extension matrix; then a rate-$N_t$ (full-rate)
BOSTC can be constructed as

$$\mathbf{X}_{\mathrm{I},N_t} = \sum_{i=1}^{N_t} \mathbf{X}_{o,i} \cdot \mathrm{diag}(\mathbf{m}_i) \tag{11}$$

where $\mathbf{X}_{o,i}$ is the $\mathbf{X}_o$ with different
sets of information symbols. It is easy to prove that the
decoding of $\mathbf{X}_{\mathrm{I},N_t}$ is not rank
deficient if there is at most one complex information symbol
at each space-time position of the seed code
$\mathbf{X}_{o, T \times N_t}$.

2) Construction II with Rate-1/2 Seed Code: Select a rate-1/2
space-time code $\mathbf{X}_{o, T \times N_t}$ with high $k$
to be the seed code, and a matrix
$\mathbf{M}_{N_t \times 2N_t} = [\mathbf{m}_1\ \mathbf{m}_2\ \cdots\ \mathbf{m}_{2N_t}]$
with full-rank
$\begin{bmatrix} \mathbf{M}^R \\ \mathbf{M}^I \end{bmatrix}$
to be the extension matrix; then a rate-$N_t$ BOSTC can be
constructed as

$$\mathbf{X}_{\mathrm{II},N_t} = \sum_{i=1}^{2N_t} \mathbf{X}_{o,i} \cdot \mathrm{diag}(\mathbf{m}_i) \tag{12}$$

where $\mathbf{X}_{o,i}$ is the $\mathbf{X}_o$ with different
sets of information symbols. It is easy to prove that the
decoding of $\mathbf{X}_{\mathrm{II},N_t}$ is not rank
deficient if there is at most one real information symbol at
each space-time position of the seed code
$\mathbf{X}_{o, T \times N_t}$.
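Construction I amounts to scaling the columns of each seed-code copy by one column of the extension matrix and summing the results. The sketch below illustrates this for 2 transmit antennas, assuming an Alamouti seed code and the $2 \times 2$ Hadamard matrix as extension matrix; both choices and all function names are assumptions made for the illustration.

```python
def alamouti(s1, s2, s3, s4):
    """Rate-1 Alamouti seed code with real symbols s1..s4 (T = N_t = 2)."""
    a, b = complex(s1, s2), complex(s3, s4)
    return [[a, b], [-b.conjugate(), a.conjugate()]]


def construction_one(seed, extension, symbol_sets):
    """Sum of X_{o,i} * diag(m_i), where m_i is the i-th extension column."""
    n_t = len(extension)
    rows = len(seed(*symbol_sets[0]))
    x = [[0j] * n_t for _ in range(rows)]
    for i, symbols in enumerate(symbol_sets):
        x_i = seed(*symbols)
        for t in range(rows):
            for n in range(n_t):
                # Right-multiplying by diag(m_i) scales column n by m_i[n].
                x[t][n] += x_i[t][n] * extension[n][i]
    return x


hadamard_2 = [[1, 1], [1, -1]]
code = construction_one(alamouti, hadamard_2, [(1, 0, 0, 0), (0, 0, 1, 0)])
print(code)
```

Each seed copy carries its own symbol set, so the combined code carries $N_t$ times as many symbols over the same $T$ symbol durations.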

B. Examples 

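BOSTC examples with $k = 4$ and $k = 8$ are presented in the
following. To our knowledge, the $k = 8$ code has the largest
$k$ value ever reported.

First we review the Hadamard matrix. A complete set of $2^m$
Walsh functions of order $m$ gives a Hadamard matrix
$\mathbf{M}_{2^m}$ [32] as follows:

$$\mathbf{M}_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \quad \mathbf{M}_4 = \begin{bmatrix} \mathbf{M}_2 & \mathbf{M}_2 \\ \mathbf{M}_2 & -\mathbf{M}_2 \end{bmatrix}, \ \cdots \tag{13}$$

Denote $\mathbf{M}_N$ as the square sub-matrix of
$\mathbf{M}_{2^m}$ in (13) formed by the first $N$ columns and
the first $N$ rows. For example,

$$\mathbf{M}_3 = \begin{bmatrix} 1 & 1 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{bmatrix}.$$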
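The recursion in (13) and the leading sub-matrix $\mathbf{M}_N$ can be generated as follows. This is a minimal sketch with hypothetical function names, using plain nested lists.

```python
def hadamard(m):
    """Return the 2^m x 2^m Hadamard matrix built by the recursion in (13)."""
    h = [[1]]
    for _ in range(m):
        top = [row + row for row in h]
        bottom = [row + [-x for x in row] for row in h]
        h = top + bottom
    return h


def leading_submatrix(matrix, n):
    """Return the sub-matrix formed by the first n rows and n columns."""
    return [row[:n] for row in matrix[:n]]


for row in hadamard(2):
    print(row)
print(leading_submatrix(hadamard(2), 3))
```

The $3 \times 3$ leading sub-matrix of $\mathbf{M}_4$ reproduces the $\mathbf{M}_3$ given above.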



Example 1 (fixed dimension): (8, 4, 1)-BOSTC for 4 transmit
antennas

Let the seed code be the rate-1 jABBA code:

$$\mathbf{X}_o = \begin{bmatrix}
s_1 + js_2 & s_3 + js_4 & js_5 - s_6 & js_7 - s_8 \\
-s_3 + js_4 & s_1 - js_2 & -js_7 - s_8 & js_5 + s_6 \\
s_5 + js_6 & s_7 + js_8 & s_1 + js_2 & s_3 + js_4 \\
-s_7 + js_8 & s_5 - js_6 & -s_3 + js_4 & s_1 - js_2
\end{bmatrix} \tag{14}$$

Fig. 7. Order of picking dispersion matrices ($\gamma = 2^{m-n}$, $n \in [1, m]$,
$m \ge 1$) in Example 2. Here $\gamma = 2$ for illustration purpose.

Then a rate-4 STC $\mathbf{X}_{\mathrm{I},4}$ for 4 transmit
antennas can be constructed as

$$\mathbf{X}_{\mathrm{I},4} = \sum_{i=1}^{4} \mathbf{X}_{o,i} \cdot \mathrm{diag}(\mathbf{m}_i) \tag{15}$$

where $\mathbf{X}_{o,i}$ is the $\mathbf{X}_o$ with different
sets of information symbols $\{s_{1,i}, \cdots, s_{8,i}\}$ and
$\mathbf{m}_i$ is the $i$th column of the Hadamard matrix
$\mathbf{M}_4$ as shown in (16).

$$\mathbf{M}_4 = \begin{bmatrix}
1 & 1 & 1 & 1 \\
1 & -1 & 1 & -1 \\
1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1
\end{bmatrix} \tag{16}$$

Following Theorem 2, $\mathbf{X}_{\mathrm{I},4}$ can be
verified [33] to have block-orthogonal structure (8, 4, 1),
where $\{s_{1,i}, \cdots, s_{4,i}\}$ and
$\{s_{5,i}, \cdots, s_{8,i}\}$ are in the $(2i-1)$th and
$(2i)$th sub-blocks, respectively. Moreover, the
block-orthogonal structure is maintained even if any of these
sub-blocks are removed, and hence $\mathbf{X}_{\mathrm{I},4}$
can be a $(\Gamma, 4, 1)$-BOSTC of code rate $\Gamma/2$
($\Gamma = 1, 2, \cdots, 8$) with $(8 - \Gamma)$ sub-blocks
removed.

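Example 2 (scalable dimension): $(2^{m+n-1}, 4, 2^{m-n})$-BOSTC
for $2^m$ transmit antennas ($m \ge 1$ integer, $n \in [1, m]$)

Let $\mathbf{C}_{l,1,1}$ ($l = 1, 2, 3$ and 4) be the
dispersion matrices of the Alamouti code [1] with $j^2 = -1$:

$$\mathbf{C}_{1,1,1} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},\
\mathbf{C}_{2,1,1} = \begin{bmatrix} j & 0 \\ 0 & -j \end{bmatrix},\
\mathbf{C}_{3,1,1} = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix},\
\mathbf{C}_{4,1,1} = \begin{bmatrix} 0 & j \\ j & 0 \end{bmatrix} \tag{17}$$

Then the dispersion matrices of a rate-1 STC for $2^m$
($m > 1$ integer) transmit antennas can be presented as: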

Cl.k,m~l 

Cl,k,m-1 

C;,fe, m _i 



Cj,2k,-. 



(18) 



where $k = 1, \cdots, 2^{m-2}$, $l = 1, 2, 3$ and 4.

The rate-1 STC with dispersion matrices in (TTTb or ( TT8l 
with the picking order shown in Fig [7] is denoted as X„. 
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Let the seed code be X Q and the extension matrix M be the 
Hadamard matrix of size 2 m x 2"\ a rate-2 m STC Xi, 2 m can 
be constructed following Construction I: 



Xl, 2 ™ = 5Z X °> ,; ' dia §( m *) 



(19) 



where X D .j is the rate-1 STC X Q with different sets of 
information symbols and is the ith column of Hadamard 
matrix M 2 ™> . 

Following Theorem 2, $X_{I,2^m}$ can be verified [33] to
have block-orthogonal structure $(2^{m+n-1}, 4, 2^{m-n})$ with $n \in [1, m]$,
where $\{s_{l,(k-1)\gamma+1,m,i}, \cdots, s_{l,k\gamma,m,i}\}$ corresponds to
$U_{p,l}$ ($l$ = 1, 2, 3 and 4) in the $p$th sub-block ($k = 1, \cdots, 2^{n-1}$,
$i = 1, \cdots, 2^m$, $p = 2^{n-1}(i - 1) + k$). Moreover, the block-orthogonal
structure is maintained even if some sub-blocks
are removed; hence $X_{I,2^m}$ can be a $(T, 4, 2^{m-n})$-BOSTC of
code rate $2^{1-n}T$ ($T = 1, \cdots, 2^{m+n-1}$) with $(2^{m+n-1} - T)$
sub-blocks removed.
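The bookkeeping of Example 2 can be sketched as follows, assuming the Sylvester recursion for the Hadamard matrix (for $m = 2$ it reproduces the $M_4$ of (16)) and the parameter and rate expressions as read from the example above. The helper names `hadamard` and `bostc_params` are hypothetical.

```python
import numpy as np

def hadamard(m):
    """Sylvester construction of the 2^m x 2^m Hadamard matrix."""
    H = np.array([[1]])
    for _ in range(m):
        H = np.block([[H, H], [H, -H]])
    return H

def bostc_params(m, n, T=None):
    """Block-orthogonal parameters and code rate quoted in Example 2.

    Returns ((Gamma, k, gamma), rate) with Gamma = T sub-blocks kept
    (default: all 2^(m+n-1)), k = 4, gamma = 2^(m-n), rate = 2^(1-n) * T.
    """
    if not 1 <= n <= m:
        raise ValueError("n must satisfy 1 <= n <= m")
    full = 2 ** (m + n - 1)
    T = full if T is None else T
    if not 1 <= T <= full:
        raise ValueError("T out of range")
    return (T, 4, 2 ** (m - n)), T * 2.0 ** (1 - n)

M = hadamard(2)
params, rate = bostc_params(m=2, n=2)
```

With `m = 2` and `n = 2` this gives the (8, 4, 1) structure and rate 4 of Example 1, and keeping only `T = 4` sub-blocks gives a rate-2 (4, 4, 1) code.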

Using the rate-1/2 real orthogonal STC in (2) as the seed 
codes, BOSTC can be obtained following Construction II as 
follows. 

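Example 3: (10, 8, 1)-BOSTC for 5 transmit antennas
Let the seed code be

$$X_O = \begin{bmatrix}
s_1 & s_2 & s_3 & s_4 & s_5 \\
-s_2 & s_1 & s_4 & -s_3 & s_6 \\
-s_3 & -s_4 & s_1 & s_2 & s_7 \\
-s_4 & s_3 & -s_2 & s_1 & s_8 \\
-s_5 & -s_6 & -s_7 & -s_8 & s_1 \\
-s_6 & s_5 & -s_8 & s_7 & -s_2 \\
-s_7 & s_8 & s_5 & -s_6 & -s_3 \\
-s_8 & -s_7 & s_6 & s_5 & -s_4
\end{bmatrix} \tag{20}$$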



and the extension matrix be 



M 



-1 1 
-1 
1 
1 
1 



J 



1 3 
1 1 



1111 
111 



3 1 1 



1 1 1 j 1 
-1 1 1 1 1 j 



(21) 



a rate-5 STC $X_{II,5}$ can be constructed following Construction
II:

$$X_{II,5} = \sum_{i=1}^{10} X_{O,i} \cdot \mathrm{diag}(m_i) \tag{22}$$

where $X_{O,i}$ is $X_O$ in (20) with a different set of information
symbols and $m_i$ is the $i$th column of $M$ in (21).

Following Theorem 2, $X_{II,5}$ can be verified to have
block-orthogonal structure (10, 8, 1), where $\{s_{1,i}, \cdots, s_{8,i}\}$ are in
the $i$th ($i = 1, \cdots, 10$) sub-block. Note that the
block-orthogonal structure is maintained even if some sub-blocks are
removed; hence $X_{II,5}$ can be a $(T, 8, 1)$-BOSTC of code rate
$T/2$ ($T = 1, 2, \cdots, 10$) with $(10 - T)$ sub-blocks removed.
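The orthogonality of the seed code in (20) can be checked numerically. The sketch below encodes the 8 x 5 design as a table of signed symbol indices and verifies that its columns are orthogonal for random real symbols; the table layout and the name `seed_codeword` are illustrative choices of ours.

```python
import numpy as np

# Signed index table for the 8 x 5 seed code in (20):
# entry +i means s_i, entry -i means -s_i.
SEED = np.array([[ 1,  2,  3,  4,  5],
                 [-2,  1,  4, -3,  6],
                 [-3, -4,  1,  2,  7],
                 [-4,  3, -2,  1,  8],
                 [-5, -6, -7, -8,  1],
                 [-6,  5, -8,  7, -2],
                 [-7,  8,  5, -6, -3],
                 [-8, -7,  6,  5, -4]])

def seed_codeword(s):
    """Fill the table with eight real symbols s[0..7]."""
    s = np.asarray(s, dtype=float)
    return np.sign(SEED) * s[np.abs(SEED) - 1]

rng = np.random.default_rng(1)
s = rng.standard_normal(8)
X_O = seed_codeword(s)
gram = X_O.T @ X_O      # expected: (s_1^2 + ... + s_8^2) * I_5
```

Eight real symbols are sent over eight symbol durations, which corresponds to the rate-1/2 (in complex symbols) real orthogonal seed mentioned in the text.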

The newly constructed BOSTC are summarized in Table II.
Interestingly, the $X_{II,5}$ code found using Construction II has
a higher $k$ value (= 8) than those found using Construction I.
$X_{II,5}$ is also the first ever $k = 8$ code.



C. Optimization 

To compare with the DjABBA code (rate 2) and the D-STTD code
(rate 2), we present an optimized rate-2 BOSTC in
the following.

Denoting the seed code $X_O$ of Example 1 as $X_O = X_{O1}(s_1, s_2, s_3, s_4) +
X_{O2}(s_5, s_6, s_7, s_8)$, a rate-2 full-diversity (4, 4, 1)-BOSTC
$X_{I,\text{rate-2}}$ with optimized design coefficients can be presented
as

$$X_{I,\text{rate-2}} = \sum_{i=0}^{1} \sum_{n=1}^{2} X_{On,2i+1} \cdot \mathrm{diag}(p_{2i+n}) \cdot \mathrm{diag}(m_{2i+1}) \tag{23}$$

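where $X_{On,2i+1}$ is $X_{On}$ with a different set of information
symbols, $m_i$ is the $i$th column vector of the Hadamard matrix
$M_4$, and the design coefficient matrix $P$ can be obtained from
computer search as

$$P = \begin{bmatrix} p_1 \\ p_2 \\ p_3 \\ p_4 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ e_1 & e_1 & e_1 & e_1 \\ e_1 & e_1 & e_1 & e_1 \end{bmatrix}$$

where $e_1$ =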

e J"0.3218^ 



VI. Simulations and Discussions 



In this section, we compare the BER performances of
the optimized $X_{I,\text{rate-2}}$ in (23) with the existing rate-2 codes
such as the D-STTD code [17] and the DjABBA code [18] in 4x2
MIMO systems. We consider the DjABBA code optimized in
Chapter 9 of [18], which is the best known rate-2 code to our
knowledge.

In the following simulations, the proposed simplified
QRDM described in Section IV-B is applied,
and all the rate-2 codes are modulated by 16-QAM (hence 8
bits/channel use). We assume that the channel is quasi-static
Rayleigh fading, and the channel state information (CSI) is
known perfectly at the receiver.
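The arithmetic behind "8 bits/channel use" and the channel assumption can be sketched as below. The unit-variance normalisation of the channel coefficients and the helper names are assumptions; the paper does not spell out its simulation code.

```python
import numpy as np

def bits_per_channel_use(code_rate, qam_order):
    """Spectral efficiency = symbols per channel use x bits per symbol."""
    return code_rate * np.log2(qam_order)

def rayleigh_channel(n_t, n_r, rng):
    """One quasi-static realisation: i.i.d. CN(0, 1) entries, held fixed
    over a whole codeword (assumed unit-variance normalisation)."""
    re = rng.standard_normal((n_t, n_r))
    im = rng.standard_normal((n_t, n_r))
    return (re + 1j * im) / np.sqrt(2)

rng = np.random.default_rng(2)
H = rayleigh_channel(4, 2, rng)          # 4 transmit, 2 receive antennas
eff = bits_per_channel_use(2, 16)        # rate-2 code with 16-QAM
```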

A. BER Performance against SNR with Given Decoding Complexities

From Remark 2, we can see that with a given surviving
path number in QRDM, the D-STTD code and the proposed
$X_{I,\text{rate-2}}$ in (23) with $k = 4$ can bring more decoding
complexity reduction than the DjABBA code with $k = 2$. In
other words, with a given decoding complexity, the D-STTD
code and the proposed $X_{I,\text{rate-2}}$ support larger surviving path
numbers than the DjABBA code. As shown in Table III, we
simulate two cases: Case I considers a decoding
complexity $O$ of around 180, while Case II allows a higher
decoding complexity $O$ of around 620, for all the D-STTD,
DjABBA and proposed $X_{I,\text{rate-2}}$ codes. The complexity order
is computed using (10).

TABLE III

QRDM Parameters for Rate-2 Codes: Decoding
Complexity O and Surviving Path Number M_c.

                            Case I           Case II
Code                        M_c     O        M_c     O
D-STTD code                 20      189      102     622
DjABBA code                 7       208      28      627
X_{I,rate-2} in (23)        16      183      64      614

TABLE II

Comparison of BOSTC for N_t Transmit Antennas over T Symbol Durations (a).

BOSTC                       N_t     T       Rate    Γ            k     γ
X_{I,4} in (15)             4       4       4       8            4     1
X_{I,2^m} in (19) (b)       2^m     2^m     2^m     2^{m+n-1}    4     2^{m-n}
X_{II,5} in (22)            5       8       5       10           8     1

(a) Assume that each complex information symbol is drawn from a square QAM without constellation
rotation, or equivalently that each real information symbol is drawn from a one-dimensional constellation.
(b) m is an integer > 1, and n is an integer no larger than m.




Fig. 8. BER against SNR with comparable decoding complexities in 4x2
MIMO systems with 8 bits/channel use. (Curves: D-STTD, DjABBA and proposed BOSTC, each for Case I and Case II.)

Fig. 9. BER curves against decoding complexity with a given SNR = 22 dB
in 4x2 MIMO systems with 8 bits/channel use. (Curves: D-STTD, DjABBA and proposed BOSTC; complexity saturation points are marked.)



The BER curves against SNR are plotted in Fig. 8. We
can see that with similar or slightly lower decoding complexity
(Table III), $X_{I,\text{rate-2}}$ proposed in (23) outperforms both the
D-STTD code and the DjABBA code. This is because $X_{I,\text{rate-2}}$
has higher diversity than the D-STTD code, and supports a
larger surviving path number than the DjABBA code (see Table
III).

B. BER Performance against Decoding Complexity with
Given SNR Value

From Fig. 8, we can see that the BER performance of an STC
decoded using the QRDM decoder is a function of the decoding
complexity. Interestingly, this function is non-linear. For
instance, with similar decoding complexities, the D-STTD code
performs better than the DjABBA code in Case I, but worse
than the DjABBA code in Case II. Hence, in this subsection,
we study the relationship between BER performance and
decoding complexity under a given SNR value.

The BER curves against decoding complexity with SNR =
22 dB are plotted in Fig. 9. We can see that 1) at different
decoding complexity levels, the best performance is achieved
by different codes. $X_{I,\text{rate-2}}$ performs the best for most
of the decoding complexity range, and specifically when
the decoding complexity order is lower than $10^3$. Therefore,
the proposed BOSTC is a better choice for systems with
limited computational power; 2) when the BER curves become
flat, the QRDM performance approaches the ML decoding
performance, although the practical decoding complexity is
far lower than the ML decoding complexity. We call such
minimum practical decoding complexity for ML decoding
performance the complexity saturation point and denote it
as "A" in Fig. 9 and Fig. 10.

Fig. 10. BER curves against decoding complexity with a given SNR = 22 dB
in 4x2 MIMO systems with 8 bits/channel use. (Curves: SNR = 20 dB, 22 dB and 24 dB; horizontal axis: M in QRDM; saturation points are marked.)

C. Complexity Saturation Point

From Fig. 9, we can see that the codes can achieve near-ML
decoding performance with a much lower practical decoding
complexity (i.e., the complexity saturation point) than the full ML
decoding complexity. When the practical decoding complexity
exceeds the complexity saturation point, the improvement in
BER performance is negligible. This is a desirable property in
high-rate MIMO communication systems.

In Fig. 9, the complexity saturation points are obtained with
a given SNR = 22 dB. To verify the stability of a code's
complexity saturation point, the BER curves of the proposed
BOSTC $X_{I,\text{rate-2}}$ with different SNR values are plotted in Fig. 10. We
can see that the complexity saturation points are almost the
same, at about $M_c = 128$. This is clearly desirable too.
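One possible way to read a saturation point off a BER-versus-complexity curve is sketched below. The tolerance-based criterion, the function name and the data are all assumptions for illustration: the curve is synthetic and is not taken from the simulations reported here.

```python
import numpy as np

def saturation_point(complexity, ber, rel_tol=0.05):
    """Smallest complexity whose BER is within rel_tol of the final
    (largest-complexity) BER, used here as a stand-in for ML performance."""
    complexity = np.asarray(complexity, dtype=float)
    ber = np.asarray(ber, dtype=float)
    order = np.argsort(complexity)
    complexity, ber = complexity[order], ber[order]
    floor = ber[-1]
    ok = ber <= floor * (1.0 + rel_tol)
    # First index from which every later point also satisfies the tolerance.
    for i in range(len(ber)):
        if ok[i:].all():
            return complexity[i]

# Synthetic, made-up curve for illustration only (not simulation data).
complexity = [10, 20, 40, 80, 160, 320]
ber = [2e-2, 6e-3, 2e-3, 1.02e-3, 1.0e-3, 1.0e-3]
point = saturation_point(complexity, ber)
```

The last point of the curve always satisfies the tolerance, so the function returns a value whenever the input is non-empty.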



VII. Conclusions 

In this paper, we introduce a new code property, called 
block-orthogonal property, for space-time codes (STC), and 
propose a new Simplified QRDM decoder to achieve sig- 
nificant decoding complexity reduction over the traditional 
breadth-first-search QRDM decoder for many well known 
high-rate STCs such as the D-STTD, DjABBA and Perfect 
codes. We prove that the proposed Simplified QRDM has 
absolutely no performance loss over the traditional QRDM, 
because the Simplified QRDM reduces only the number of Eu- 
clidean metric calculations but not the surviving path number. 
We also derive the maximum achievable complexity reduction 
in terms of the block-orthogonal parameters. To further exploit 
the block-orthogonal property, we construct new BOSTC with 
better complexity reduction advantage, and we show how to 
optimize them for full diversity and maximum coding gain 
without affecting the block-orthogonal code structure. The
proposed BOSTC construction rules are scalable, and they
support an arbitrary number of transmit antennas. Simulations
of BER against SNR and against decoding complexity show 
that the proposed BOSTC outperforms the best known rate-2 
STC under almost all scenarios (except at full ML decoding 
complexity level), and it requires a QRDM complexity level 
much lower than the full ML decoding complexity level to 
achieve near-ML decoding performance. 

Finally, we remark that the decoding complexity reduction
principle of block-orthogonal code structure presented in both
[34] and this paper is applicable to both breadth-first search
and depth-first search decoders. Hence, many benefits seen in
this paper can also be expected for sphere decoding [26].



Appendix A 

Following the signal model (O, the equivalent channel 
matrix in an $N_t \times N_r$ MIMO system with the channel matrix
$H_{N_t \times N_r} = [h_1 \; h_2 \; \cdots \; h_{N_r}]$ is



H 



2TN r 



= [Hi H 2 ] 
[ 4h • 



[hi • • • h fc h fe+ i 
^ifi • • 



h 2 fc] 

■%h 



h i 

-j 

hi 



sti = 



A 

Ai 










N r xN r 



B, ••• 

o a ••• o 







Bi 



N r xN r 



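Due to (8b), we have $\mathcal{A}_i^T \mathcal{A}_i = I$, $\mathcal{B}_i^T \mathcal{B}_i = I$ ($i = 1, \cdots, k$),
and $|h_1| = |h_2| = \cdots = |h_{2k}| = |h|$.

Due to (8c), an STC with dispersion matrices $A_1, \cdots, A_k$
is orthogonal and hence its equivalent channel matrix $H_1$
satisfies $H_1^T H_1 = |h|^2 I$ (a detailed proof can be found
in [13]). Similarly, we have $H_2^T H_2 = |h|^2 I$.
Under QR decomposition, $H = QR$ with $Q = [Q_1 \; Q_2]$,
$Q_1 = [q_1 \cdots q_k] = \frac{1}{|h|} H_1$ and $Q_2 = [q_{k+1} \cdots q_{2k}]$;
$H$ is full-rank due to (8a), and

$$R = \begin{bmatrix} R_1 & E \\ 0 & R_2 \end{bmatrix}, \qquad R_1 = |h| \, I_{k \times k}, \qquad E = Q_1^T H_2,$$

where $R_1 = |h| \, I_{k \times k}$ follows from $H_1^T H_1 = |h|^2 I$.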

In the following, we will prove that $R_2$ is diagonal and
hence this STC has block-orthogonal structure $(2, k, 1)$. We
can see that

$$\begin{aligned}
H_2 &= Q_1 E + Q_2 R_2 \\
H_2 - Q_1 E &= Q_2 R_2 \\
(H_2 - Q_1 E)^T (H_2 - Q_1 E) &= R_2^T Q_2^T Q_2 R_2 \\
|h|^2 I + E^T E - H_2^T Q_1 E - E^T Q_1^T H_2 &= R_2^T R_2 \quad (\text{where } Q^T Q = I) \\
|h|^2 I - E^T E &= R_2^T R_2 \quad (\text{where } Q_1^T H_2 = E)
\end{aligned}$$

In other words, $E^T E$ is diagonal $\Longleftrightarrow$ $R_2^T R_2$ is diagonal.
Since $R_2$ is upper triangular, $R_2^T R_2$ is diagonal $\Longleftrightarrow$ $R_2$ is
diagonal. Hence, in the following, we will prove that $E^T E$ is
diagonal under the condition (O. 
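The identity $|h|^2 I - E^T E = R_2^T R_2$ derived above can be checked numerically. The sketch below uses random real matrices with $H_1^T H_1 = H_2^T H_2 = c^2 I$ as stand-ins for the equivalent channel blocks; the sizes and the value of `c` are arbitrary choices, not parameters of any code in this paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, c = 8, 3, 1.7      # rows, block size, common column norm |h| (arbitrary)

def scaled_orthonormal(rows, cols, scale):
    """Random matrix with H^T H = scale^2 * I."""
    Q, _ = np.linalg.qr(rng.standard_normal((rows, cols)))
    return scale * Q

H1 = scaled_orthonormal(n, k, c)
H2 = scaled_orthonormal(n, k, c)
Q, R = np.linalg.qr(np.hstack([H1, H2]))

E = R[:k, k:]            # upper-right block, equals Q1^T H2
R2 = R[k:, k:]           # lower-right upper-triangular block
lhs = c**2 * np.eye(k) - E.T @ E
rhs = R2.T @ R2
```

The check does not depend on the sign convention of the QR routine, since flipping the sign of a row of $R$ leaves both $E^T E$ and $R_2^T R_2$ unchanged.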

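Since $E = Q_1^T H_2$ and $Q_1 = \frac{1}{|h|} H_1$, we have

$$E = \frac{1}{|h|} H_1^T H_2,$$

$$E^T E = \frac{1}{|h|^2} H_2^T H_1 H_1^T H_2 = \frac{1}{|h|^2} \left[ h_{k+i}^T H_1 H_1^T h_{k+j} \right]_{k \times k} = \frac{1}{|h|^2} \left[ h^T \mathcal{B}_i^T H_1 H_1^T \mathcal{B}_j h \right]_{k \times k}.$$

To ensure that $E^T E = \frac{1}{|h|^2} H_2^T H_1 H_1^T H_2$ is diagonal, we need

$$h^T \mathcal{B}_i^T H_1 H_1^T \mathcal{B}_j h = 0, \quad i \neq j. \tag{25}$$

Let $\mathcal{A}_i = [a_{i1}^T \; a_{i2}^T \cdots a_{i,2TN_r}^T]^T = [a_{iuv}]_{2TN_r \times 2N_tN_r}$, where $a_{iu}$ is the $u$th row of $\mathcal{A}_i$, and
$\mathcal{B}_i = [b_{i1} \; b_{i2} \cdots b_{i,2N_tN_r}] = [b_{iuv}]_{2TN_r \times 2N_tN_r}$, where $b_{ip}$ is the $p$th column of $\mathcal{B}_i$; we have

$$\mathcal{B}_i^T H_1 H_1^T \mathcal{B}_j = [w_{pq}]_{2N_tN_r \times 2N_tN_r} = \mathcal{B}_i^T [\mathcal{A}_1 h \cdots \mathcal{A}_k h] [\mathcal{A}_1 h \cdots \mathcal{A}_k h]^T \mathcal{B}_j$$

with

$$\begin{aligned}
w_{pq} &= b_{ip}^T [\mathcal{A}_1 h \cdots \mathcal{A}_k h] [\mathcal{A}_1 h \cdots \mathcal{A}_k h]^T b_{jq} \\
&= \left[ \sum_{u=1}^{2TN_r} b_{iup} a_{1u} h \; \cdots \; \sum_{u=1}^{2TN_r} b_{iup} a_{ku} h \right] \left[ \sum_{v=1}^{2TN_r} b_{jvq} a_{1v} h \; \cdots \; \sum_{v=1}^{2TN_r} b_{jvq} a_{kv} h \right]^T \\
&= \sum_{\kappa=1}^{k} \left( \sum_{u=1}^{2TN_r} b_{iup} a_{\kappa u} h \right) \left( \sum_{v=1}^{2TN_r} b_{jvq} a_{\kappa v} h \right) \\
&= h^T \left( \sum_{\kappa=1}^{k} \left( \sum_{u=1}^{2TN_r} b_{iup} a_{\kappa u}^T \cdot \sum_{v=1}^{2TN_r} b_{jvq} a_{\kappa v} \right) \right) h.
\end{aligned}$$

For a clear presentation, we define

$$D_{pq} = \sum_{\kappa=1}^{k} \left( \sum_{u=1}^{2TN_r} b_{iup} a_{\kappa u}^T \cdot \sum_{v=1}^{2TN_r} b_{jvq} a_{\kappa v} \right) = [d_{pqst}]_{2N_tN_r \times 2N_tN_r}$$

where

$$d_{pqst} = \sum_{\kappa=1}^{k} \left( \sum_{u=1}^{2TN_r} b_{iup} a_{\kappa us} \cdot \sum_{v=1}^{2TN_r} b_{jvq} a_{\kappa vt} \right).$$

With $i, j = 1, \cdots, k$, $i \neq j$ and $h = [h_1 \cdots h_{2N_tN_r}]^T$,
for condition (25) to be valid, we first simplify the term
$h^T \mathcal{B}_i^T H_1 H_1^T \mathcal{B}_j h$ as follows:

$$\begin{aligned}
h^T \mathcal{B}_i^T H_1 H_1^T \mathcal{B}_j h &= h^T [w_{pq}]_{2N_tN_r \times 2N_tN_r} h \\
&= \sum_{p=1}^{2N_tN_r} \sum_{q=1}^{2N_tN_r} h_p h_q w_{pq} \\
&= \sum_{p=1}^{2N_tN_r} \sum_{q=1}^{2N_tN_r} h_p h_q \, h^T \left( \sum_{\kappa=1}^{k} \left( \sum_{u=1}^{2TN_r} b_{iup} a_{\kappa u}^T \cdot \sum_{v=1}^{2TN_r} b_{jvq} a_{\kappa v} \right) \right) h \\
&= \sum_{p=1}^{2N_tN_r} \sum_{q=1}^{2N_tN_r} \sum_{s=1}^{2N_tN_r} \sum_{t=1}^{2N_tN_r} h_p h_q h_s h_t \cdot d_{pqst}.
\end{aligned} \tag{26}$$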

Since $h_p$, $h_q$, $h_s$ and $h_t$ are random channel coefficients, for
$h^T \mathcal{B}_i^T H_1 H_1^T \mathcal{B}_j h$, i.e., (26), being 0, all the coefficients of the
polynomial $\sum_{p=1}^{2N_tN_r} \sum_{q=1}^{2N_tN_r} \sum_{s=1}^{2N_tN_r} \sum_{t=1}^{2N_tN_r} h_p h_q h_s h_t \cdot d_{pqst}$ should be 0, i.e.,

$$\sum_{(p,q,s,t) \in \mathbb{S}_0} d_{pqst} = 0 \tag{27}$$

where each element (tuple) of set $\mathbb{S}_0$ includes 4 uniquely-permuted
scalars drawn from $\{1, \cdots, 2N_tN_r\}$ and corresponds
to a term $h_p h_q h_s h_t$ with coefficient $\sum_{(p,q,s,t) \in \mathbb{S}_0} d_{pqst}$.
Since $\mathcal{A}_\kappa$ ($\mathcal{B}_i$) is block-diagonal with the same main diagonal
sub-matrix $A_\kappa$ ($B_i$; $\kappa, i = 1, \cdots, k$), there must be at
least one zero value between $b_{iup}$ and $a_{\kappa us}$, i.e., $b_{iup} a_{\kappa us} = 0$,
when $p$ and $s$ correspond to two different diagonal sub-matrices, i.e.,
$\lfloor (p-1)/(2N_t) \rfloor \neq \lfloor (s-1)/(2N_t) \rfloor$ with the floor function $\lfloor \cdot \rfloor$. Hence, $p$ and $s$
can be considered to be corresponding to the same sub-matrix
$A_\kappa$ ($B_i$). Hence (27) is equivalent to

$$\sum_{(p,q,s,t) \in \mathbb{S}} d_{pqst} = 0 \tag{28}$$

where each element (tuple) of set $\mathbb{S}$ includes 4 uniquely-permuted
scalars drawn from $\{1, \cdots, 2N_t\}$.



Hence, with (28), $E^T E = \frac{1}{|h|^2} H_2^T H_1 H_1^T H_2$ is
diagonal, i.e., $R_2$ is diagonal. Since $R_1$ and $R_2$ are diagonal,
Theorem 1 is proved.

Appendix B

Since $\{B_1, \cdots, B_k\}$ satisfy the QOC, $H_2^T H_2$ is diagonal.
Under QR decomposition,

$$H = [H_1 \; H_2] = QR = [Q_1 \; Q_2] \begin{bmatrix} R_1 & E_{12} \\ 0 & R_2 \end{bmatrix} \tag{29}$$

where $E_{12}$ is the projection coefficient matrix of vectors
$h_{k+1}, \cdots, h_{k+k}$ onto vector space $\{h_1, \cdots, h_k\}$. Following
the QR decomposition algorithm, we see that $E_{12} = Q_1^T H_2$.
In (29), we have

$$\begin{aligned}
H_2 &= Q_1 E_{12} + Q_2 R_2 \\
H_2 - Q_1 E_{12} &= Q_2 R_2 \\
(H_2 - Q_1 E_{12})^T (H_2 - Q_1 E_{12}) &= (Q_2 R_2)^T (Q_2 R_2) \\
H_2^T H_2 - E_{12}^T E_{12} &= R_2^T R_2 \quad (Q_2^T Q_2 = I)
\end{aligned}$$

Hence, with diagonal $H_2^T H_2$, we have: $E_{12}^T E_{12}$ is diagonal
$\Longleftrightarrow$ $R_2^T R_2$ is diagonal $\Longleftrightarrow$ $R_2$ is diagonal, where $R_2$ is
known to be upper triangular.

Hence Theorem 2 is proved.
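A toy numerical instance of this argument is sketched below. The matrices are hand-built so that both $H_2^T H_2$ and $E_{12}^T E_{12}$ are diagonal; they are not the equivalent channel of any actual STC, and all numeric values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
# Orthonormal basis q1..q4 of a 4-dimensional subspace of R^6.
Qb, _ = np.linalg.qr(rng.standard_normal((6, 4)))
q1, q2, q3, q4 = Qb.T

H1 = np.column_stack([2.0 * q1, 3.0 * q2])
# Each column of H2 overlaps with only one column of H1, so that both
# H2^T H2 and E12^T E12 are diagonal (toy choice, not a real STC).
H2 = np.column_stack([0.5 * q1 + 1.5 * q3, -1.0 * q2 + 2.0 * q4])

Q, R = np.linalg.qr(np.hstack([H1, H2]))
E12, R2 = R[:2, 2:], R[2:, 2:]

# Off-diagonal part of R2; it vanishes when E12^T E12 is diagonal.
off_diag = R2 - np.diag(np.diag(R2))
```

Here the diagonal of $R_2$ has magnitudes 1.5 and 2.0, the components of the columns of $H_2$ that lie outside the span of $H_1$.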